fix: Hardcode Legacy behavior to True to resolve warning. #446
Description of the change
This PR proposes setting Legacy=True in the AutoTokenizer. This keeps the functionality of sft_trainer.py that we currently have, but it stops the legacy-behavior warning from appearing. More discussion on the nature of the legacy behavior can be found here: huggingface/transformers#24565.
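As a rough illustration of the proposed change, the sketch below passes the flag explicitly when loading a tokenizer. It is not the actual code in sft_trainer.py, which is not shown here. The helper names `tokenizer_kwargs` and `load_tokenizer` are hypothetical, and the keyword is written in lowercase as `legacy`, which is how the Hugging Face tokenizers that support this flag spell it.

```python
from typing import Any, Dict


def tokenizer_kwargs(legacy: bool = True) -> Dict[str, Any]:
    """Build the extra keyword arguments for the tokenizer (hypothetical helper).

    Passing `legacy` explicitly, instead of relying on the default, is the
    change this PR describes as silencing the warning.
    """
    return {"legacy": legacy}


def load_tokenizer(model_name_or_path: str, legacy: bool = True):
    """Load a tokenizer with the legacy flag set explicitly (hypothetical helper)."""
    # Imported lazily so tokenizer_kwargs() works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(
        model_name_or_path, **tokenizer_kwargs(legacy)
    )
```

Calling `load_tokenizer(path)` with a local model path or hub identifier of your choice would then load the tokenizer with the flag set to True.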
Even when Legacy is not explicitly set, the tokenizer sets it to True by default. Thus, this change will not alter the current functionality.
I have done some testing on the impact of the Legacy behavior when tuning with sft_trainer and have included my results below.
Related issue number
Resolving a warning raised in Issue #1205.
How to verify the PR
Run tuning locally and verify that the warning message from above does not appear anymore.
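One possible way to make this check less manual is to capture the tuning output to a log file and search it for a fragment of the warning. This is only a sketch: the function names are hypothetical, and the `marker` argument is whatever snippet of the warning text you choose to look for, since the exact wording is not reproduced here.

```python
from typing import Iterable, List


def lines_containing(log_lines: Iterable[str], marker: str) -> List[str]:
    """Return the log lines that contain `marker`, ignoring case."""
    needle = marker.lower()
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]


def warning_absent(log_path: str, marker: str) -> bool:
    """Return True when no line of the log file contains `marker`."""
    with open(log_path, encoding="utf-8") as handle:
        return not lines_containing(handle, marker)
```

For example, `warning_absent("tuning.log", "legacy")` would return True only if no line of a hypothetical `tuning.log` mentions the word.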
Was the PR tested
I created one image with Legacy = True and one with Legacy = False and tested each using the Travis CI flow. The changes were tested on llama3 and granite, using both LoRA tuning and Fine Tuning. I have graphed the results here:
The F1 micro score was identical under both settings for both models when using Fine Tuning. When using LoRA tuning, both models showed a small improvement in F1 micro score when Legacy was set to True. The difference is very small, however, and might be within the margin of error of the tests. We concluded that the results are essentially the same regardless of what Legacy was set to.
We also wondered whether the setting would change the EOS and BOS tokens, so I ran tuning locally and compared the tokenized outputs. The outputs were the same for both settings, at 1 epoch and at 5 epochs. I have included the tokenized output files below for comparison.
Legacy True 1 Epoch.txt
Legacy False 1 Epoch.txt
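For anyone who wants to repeat the comparison, a minimal sketch of a line-by-line check is shown below. It assumes the attached files are plain text with one tokenized record per line, which is an assumption about their format. The function names are hypothetical.

```python
from itertools import zip_longest
from typing import List, Optional, Tuple

Difference = Tuple[int, Optional[str], Optional[str]]


def first_difference(a: List[str], b: List[str]) -> Optional[Difference]:
    """Return (index, left, right) for the first differing line, or None.

    A missing line in the shorter input is reported as None.
    """
    for index, (left, right) in enumerate(zip_longest(a, b)):
        if left != right:
            return index, left, right
    return None


def compare_files(path_a: str, path_b: str) -> Optional[Difference]:
    """Compare two tokenized output files line by line."""
    with open(path_a, encoding="utf-8") as fa, open(path_b, encoding="utf-8") as fb:
        return first_difference(fa.read().splitlines(), fb.read().splitlines())
```

A result of None from `compare_files("Legacy True 1 Epoch.txt", "Legacy False 1 Epoch.txt")` would mean the two files match line for line.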
In conclusion, we determined that the impact of the Legacy setting on the tokenizer was negligible. We decided to keep the functionality the same as it is, but to hardcode Legacy=True to avoid the warning appearing.